Pipeline Processor


Q31.

A pipelined processor uses a 4-stage instruction pipeline with the following stages: Instruction fetch (IF), Instruction decode (ID), Execute (EX) and Writeback (WB). The arithmetic operations as well as the load and store operations are carried out in the EX stage. The sequence of instructions corresponding to the statement X = (S - R * (P + Q))/T is given below. The values of variables P, Q, R, S and T are available in the registers R0, R1, R2, R3 and R4 respectively, before the execution of the instruction sequence.

\begin{array}{lll}
\text{ADD} & \text{R5, R0, R1} & \text{; R5} \leftarrow \text{R0 + R1} \\
\text{MUL} & \text{R6, R2, R5} & \text{; R6} \leftarrow \text{R2 * R5} \\
\text{SUB} & \text{R5, R3, R6} & \text{; R5} \leftarrow \text{R3 - R6} \\
\text{DIV} & \text{R6, R5, R4} & \text{; R6} \leftarrow \text{R5 / R4} \\
\text{STORE} & \text{R6, X} & \text{; X} \leftarrow \text{R6} \\
\end{array}

The number of Read-After-Write (RAW) dependencies, Write-After-Read (WAR) dependencies, and Write-After-Write (WAW) dependencies in the sequence of instructions are, respectively,
GateOverflow
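
The dependency count asked for in Q31 can be checked mechanically. The sketch below is one possible way to do it, assuming the "nearest producer" convention: a RAW pairs each register read with the most recent earlier write, a WAW pairs each write with the most recent earlier write, and a WAR pairs each write with the reads of that register since its last write. The names `program` and `count_dependencies` are illustrative and not part of the question; other counting conventions would give different totals.

```python
# Each entry: (destination register or None, source registers).
program = [
    ("R5", ("R0", "R1")),  # ADD   R5, R0, R1
    ("R6", ("R2", "R5")),  # MUL   R6, R2, R5
    ("R5", ("R3", "R6")),  # SUB   R5, R3, R6
    ("R6", ("R5", "R4")),  # DIV   R6, R5, R4
    (None, ("R6",)),       # STORE R6, X (writes memory, not a register)
]


def count_dependencies(program):
    """Return (RAW, WAR, WAW) counts under the nearest-producer convention."""
    raw = war = waw = 0
    last_writer = {}          # register -> index of its most recent writer
    readers_since_write = {}  # register -> readers since its last write
    for i, (dst, srcs) in enumerate(program):
        for reg in set(srcs):
            if reg in last_writer:
                raw += 1  # this read depends on an earlier write
            readers_since_write.setdefault(reg, []).append(i)
        if dst is not None:
            # Earlier reads of dst that this write must not overtake.
            war += len([j for j in readers_since_write.get(dst, []) if j != i])
            if dst in last_writer:
                waw += 1  # this write follows an earlier write to dst
            last_writer[dst] = i
            readers_since_write[dst] = []
    return raw, war, waw


print(count_dependencies(program))  # (4, 2, 2)
```

Under this convention the sequence has 4 RAW, 2 WAR and 2 WAW dependencies.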

Q32.

A 5-stage pipelined processor has Instruction Fetch (IF), Instruction Decode (ID), Operand Fetch (OF), Perform Operation (PO) and Write Operand (WO) stages. The IF, ID, OF and WO stages take 1 clock cycle each for any instruction. The PO stage takes 1 clock cycle for ADD and SUB instructions, 3 clock cycles for MUL instruction, and 6 clock cycles for DIV instruction respectively. Operand forwarding is used in the pipeline. What is the number of clock cycles needed to execute the following sequence of instructions?
GateOverflow

Q33.

A pipelined processor uses a 4-stage instruction pipeline with the following stages: Instruction fetch (IF), Instruction decode (ID), Execute (EX) and Writeback (WB). The arithmetic operations as well as the load and store operations are carried out in the EX stage. The sequence of instructions corresponding to the statement X = (S - R * (P + Q))/T is given below. The values of variables P, Q, R, S and T are available in the registers R0, R1, R2, R3 and R4 respectively, before the execution of the instruction sequence.

\begin{array}{lll}
\text{ADD} & \text{R5, R0, R1} & \text{; R5} \leftarrow \text{R0 + R1} \\
\text{MUL} & \text{R6, R2, R5} & \text{; R6} \leftarrow \text{R2 * R5} \\
\text{SUB} & \text{R5, R3, R6} & \text{; R5} \leftarrow \text{R3 - R6} \\
\text{DIV} & \text{R6, R5, R4} & \text{; R6} \leftarrow \text{R5 / R4} \\
\text{STORE} & \text{R6, X} & \text{; X} \leftarrow \text{R6} \\
\end{array}

The IF, ID and WB stages take 1 clock cycle each. The EX stage takes 1 clock cycle each for the ADD, SUB and STORE operations, and 3 clock cycles each for MUL and DIV operations. Operand forwarding from the EX stage to the ID stage is used. The number of clock cycles required to complete the sequence of instructions is
GateOverflow
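
The timing in Q33 can be explored with a small in-order pipeline model. This is only a sketch under stated assumptions: there are no buffers between stages, so an instruction holds its stage until the next stage is free, and forwarding is taken to make a result usable as soon as the producer leaves EX. Because every instruction here depends on its immediate predecessor, in-order occupancy of EX already enforces the data dependencies, so they are not modeled separately. The function name `pipeline_cycles` and the latency table are illustrative.

```python
def pipeline_cycles(latencies):
    """latencies[i][s] = cycles instruction i spends in stage s.

    Returns the cycle count at which the last instruction leaves the
    final stage, for an in-order pipeline without inter-stage buffers.
    """
    n_stages = len(latencies[0])
    prev_start = None  # per-stage start times of the previous instruction
    prev_done = 0      # time the previous instruction left the last stage
    for lat in latencies:
        start = [0] * n_stages
        for s in range(n_stages):
            # Earliest time this instruction has finished the prior stage.
            ready = 0 if s == 0 else start[s - 1] + lat[s - 1]
            if prev_start is None:
                free = 0
            elif s + 1 < n_stages:
                # The predecessor vacates stage s when it enters stage s + 1.
                free = prev_start[s + 1]
            else:
                free = prev_done
            start[s] = max(ready, free)
        prev_start = start
        prev_done = start[-1] + lat[-1]
    return prev_done


# Columns: IF, ID, EX, WB. Rows: ADD, MUL, SUB, DIV, STORE.
q33 = [
    [1, 1, 1, 1],
    [1, 1, 3, 1],
    [1, 1, 1, 1],
    [1, 1, 3, 1],
    [1, 1, 1, 1],
]
print(pipeline_cycles(q33))  # 12
```

With these assumptions the model reports 12 cycles. A different reading of "forwarding from EX to ID" would need an explicit data-hazard term.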

Q34.

Delayed branching can help in the handling of control hazards. The following code is to run on a pipelined processor with one branch delay slot:

I1: ADD R2 \leftarrow R7 + R8
I2: SUB R4 \leftarrow R5 - R6
I3: ADD R1 \leftarrow R2 + R3
I4: STORE Memory[R4] \leftarrow R1
BRANCH to Label if R1 == 0

Which of the instructions I1, I2, I3 or I4 can legitimately occupy the delay slot without any other program modification?
GateOverflow

Q35.

A non-pipelined single-cycle processor operating at 100 MHz is converted into a synchronous pipelined processor with five stages requiring 2.5 nsec, 1.5 nsec, 2 nsec, 1.5 nsec and 2.5 nsec, respectively. The delay of the latches is 0.5 nsec. The speedup of the pipelined processor for a large number of instructions is:
GateOverflow
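
The arithmetic behind Q35 can be sketched as follows, assuming the usual textbook model: the pipelined clock period is the slowest stage plus the latch delay, and for a large instruction count the speedup is the ratio of the non-pipelined instruction time to that period. Variable names are illustrative.

```python
# Non-pipelined: one instruction per cycle at 100 MHz -> 10 ns each.
non_pipelined_ns = 1e3 / 100  # period in ns for a clock given in MHz

stage_delays_ns = [2.5, 1.5, 2.0, 1.5, 2.5]
latch_delay_ns = 0.5

# Pipelined clock period: slowest stage plus latch overhead.
pipelined_ns = max(stage_delays_ns) + latch_delay_ns

speedup = non_pipelined_ns / pipelined_ns
print(round(speedup, 2))  # 3.33
```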

Q36.

Which of the following are NOT true in a pipelined processor?
I. Bypassing can handle all RAW hazards
II. Register renaming can eliminate all register-carried WAR hazards
III. Control hazard penalties can be eliminated by dynamic branch prediction
GateOverflow

Q37.

A processor takes 12 cycles to complete an instruction I. The corresponding pipelined processor uses 6 stages with the execution times of 3,2,5,4,6 and 2 cycles respectively. What is the asymptotic speedup assuming that a very large number of instructions are to be executed?
GateOverflow
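
For Q37, a minimal sketch of the asymptotic speedup, assuming the pipeline advances at the pace of its slowest stage and that there is no latch overhead (the question states none). Names are illustrative.

```python
non_pipelined_cycles = 12
stage_cycles = [3, 2, 5, 4, 6, 2]

# In steady state, one instruction completes every max(stage_cycles) cycles.
cycles_per_instruction = max(stage_cycles)

asymptotic_speedup = non_pipelined_cycles / cycles_per_instruction
print(asymptotic_speedup)  # 2.0
```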

Q38.

Consider a 4 stage pipeline processor. The number of cycles needed by the four instructions I1, I2, I3, I4 in stages S1, S2, S3, S4 is shown below: What is the number of cycles needed to execute the following loop? For (i=1 to 2) {I1; I2; I3; I4;}
GateOverflow

Q39.

Consider a 6-stage instruction pipeline, where all stages are perfectly balanced. Assume that there is no cycle-time overhead of pipelining. When an application is executing on this 6-stage pipeline, the speedup achieved with respect to non-pipelined execution if 25% of the instructions incur 2 pipeline stall cycles is _________.
GateOverflow
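
The calculation in Q39 can be written out as below. The sketch assumes an ideal pipeline CPI of 1 and a non-pipelined cost of one full pass through all 6 balanced stages per instruction. Variable names are illustrative.

```python
stages = 6
stall_fraction = 0.25  # fraction of instructions that stall
stall_cycles = 2       # stall cycles for each such instruction

# Average cycles per instruction in the pipeline, including stalls.
pipelined_cpi = 1 + stall_fraction * stall_cycles

# Non-pipelined execution needs `stages` cycles per instruction.
speedup = stages / pipelined_cpi
print(speedup)  # 4.0
```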

Q40.

A CPU has a five-stage pipeline and runs at 1 GHz frequency. Instruction fetch happens in the first stage of the pipeline. A conditional branch instruction computes the target address and evaluates the condition in the third stage of the pipeline. The processor stops fetching new instructions following a conditional branch until the branch outcome is known. A program executes 10^{9} instructions out of which 20% are conditional branches. If each instruction takes one cycle to complete on average, then the total execution time of the program is
GateOverflow
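
A short sketch of the Q40 arithmetic, assuming that resolving the branch in stage 3 while fetching in stage 1 costs 2 stall cycles per conditional branch. Variable names are illustrative.

```python
instructions = 10**9
branches = instructions * 20 // 100  # 20% are conditional branches
clock_hz = 10**9                     # 1 GHz

fetch_stage = 1
resolve_stage = 3
# Fetch is blocked until the branch reaches the resolving stage.
stall_per_branch = resolve_stage - fetch_stage

total_cycles = instructions + branches * stall_per_branch
execution_time_s = total_cycles / clock_hz
print(execution_time_s)  # 1.4
```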